Pattern-Guided k-Anonymity

Authors

Robert Bredereck

André Nichterlein

Rolf Niedermeier

Abstract

We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times; the goal of the NP-hard k-ANONYMITY problem is then to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. Building on previous work and addressing its deficiencies, we describe an enhanced k-anonymization problem called PATTERN-GUIDED k-ANONYMITY, in which users specify the combinations in which suppressions may occur. In this way, the user of the anonymized data can express the differing importance of various data features. We show that PATTERN-GUIDED k-ANONYMITY is NP-hard. We complement this with a fixed-parameter tractability result based on a “data-driven parameterization” and, building on it, develop an exact solution method based on integer linear programming (ILP), as well as a simple but very effective greedy heuristic. Experiments on several real-world datasets show that our heuristic easily matches the established “Mondrian” algorithm for k-ANONYMITY in the quality of the anonymization and outperforms it in running time.
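The definitions in the abstract can be illustrated with a minimal sketch. It is not the paper's ILP or greedy heuristic; it only checks k-anonymity and applies user-specified suppression patterns to a toy matrix. The names `is_k_anonymous`, `apply_pattern`, and `suppression_cost`, the `*` symbol, and the encoding of a pattern as a tuple of booleans are assumptions made for illustration.

```python
from collections import Counter

STAR = "*"  # assumed marker for a suppressed (blanked-out) entry


def is_k_anonymous(rows, k):
    """Return True if every distinct row occurs at least k times."""
    counts = Counter(map(tuple, rows))
    return all(count >= k for count in counts.values())


def apply_pattern(row, pattern):
    """Suppress the entries of `row` where `pattern` is True.

    A pattern is modeled here as a tuple of booleans, one per column.
    """
    return tuple(STAR if suppress else value
                 for value, suppress in zip(row, pattern))


def suppression_cost(assignment):
    """Count the suppressed entries over all rows (one pattern per row)."""
    return sum(sum(pattern) for pattern in assignment)


# Toy data matrix: the first two rows differ only in the second column.
rows = [
    ("a", "x", 1),
    ("a", "y", 1),
    ("b", "x", 2),
    ("b", "x", 2),
]

# The user allows only these suppression combinations (hypothetical choice).
allowed_patterns = [
    (False, False, False),  # suppress nothing
    (False, True, False),   # suppress only the second column
]

# One allowed pattern per row; chosen by hand for this example.
assignment = [allowed_patterns[1], allowed_patterns[1],
              allowed_patterns[0], allowed_patterns[0]]
anonymized = [apply_pattern(r, p) for r, p in zip(rows, assignment)]
```

In this toy instance the original matrix is not 2-anonymous, while the anonymized one is, at a cost of two suppressed entries. Finding a cheapest valid assignment of patterns to rows is the optimization problem the abstract describes as NP-hard.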

Similar resources

Privacy-preserving data mining: A feature set partitioning approach

In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for ...

Fast Data Anonymization with Low Information Loss

Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy preserving paradigms of k-anonymity and ℓ-diversity. k-anonymity protects against the identification of an individual’s record. ℓ-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches s...

Techniques for Content Subscription Anonymity with Distributed Brokers

When issuing a one-shot or continuous content-based subscription, there is an inherent tradeoff between the privacy of the subscriber and the accuracy of the matching notifications. The former can be described in terms of how well the exposed information uniquely characterized the subscriber, and the latter how well the returned data items match the subscriber’s real interests. In this paper, w...

k-Anonymous Patterns

It is generally believed that data mining results do not violate the anonymity of the individuals recorded in the source database. In fact, data mining models and patterns, in order to ensure a required statistical significance, represent a large number of individuals and thus conceal individual identities: this is the case of the minimum support threshold in association rule mining. In this pa...

Improved Univariate Microaggregation for Integer Values

Privacy issues during data publishing are an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...

Publication date: 2013
